1. INTRODUCTION

Autism Spectrum Disorder (ASD) is a syndrome that adversely affects a child, with behavioral
symptoms starting to appear during the first year of life [1]. This early childhood onset includes symptoms
such as a lack of social interaction and very slow development of language skills [2-3]. A continuous
assessment of character and behavior is conducted by specialists in order to detect the presence of autism
in a child. A documented analysis by pediatricians stated that an autistic child at approximately 24 months
is still unable to produce two meaningful words that do not involve imitation or repetition. Despite
extensive research, the exact factors that cause this disorder remain unknown. This atypical behavior may
be very difficult to detect because the changes caused by the primary neural impairment are barely
noticeable.

ASD is related to the EEG signal in that a significant decline of EEG complexity is observed in
autistic children. Noteworthy differences were observed between brain regions in the right hemisphere and
the central cortex, as reported by [4]. EEG signals are electrical voltages induced on the electrodes by brain
electromagnetic signals (BEMS) [5]. At the hospital, a specialized technician measures, marks and places
about 16-25 electrodes on the patient. Many studies show that the EEG signals, which consist of the Alpha,
Beta, Gamma, Delta and Theta bands, of a child with ASD differ from those of a non-ASD child [6].

Deep learning is a subset of machine learning. As mentioned by researchers [7-8], deep learning
uses many hidden neurons and layers, usually more than two, as an architectural benefit, together with new
training platforms; as data availability increases, deep learning systems can develop gradually and fill in
the gaps where human interpretation is not possible. Both machine learning and deep learning include
supervised and unsupervised learning. Supervised learning develops a predictive model based on input and
output data, while unsupervised learning groups and interprets data based on input data only. The study
conducted by [9] mentioned that deep learning can be an excellent tool for clinical brain imaging. This is
further supported by [10], which states that deep learning is a very reliable method for quickly assessing a
patient's health condition by classifying diseases and removing redundancy while maintaining the
correctness and precision of disease identification. As a result, detecting diseases becomes less time
consuming, and health care personnel can be more certain of their diagnosis at the decision-making stage.

This paper is organized as follows: The next section describes the proposed approach and the 
database used. The results and analysis are shown in Section 3. Finally, Section 4 concludes this paper. 

The number of children diagnosed with Autism Spectrum Disorder (ASD) has been increasing in the
past few years, as stated by the Ministry of Health, Malaysia. Attention has been drawn to this
neurodevelopmental disorder because its cause is unknown. Diagnosing ASD before the age of three is very
challenging because ASD is highly associated with either an over-abundance or a shortage of neuron
connections in the brain. The formation of these abnormal connections during a child's growth is so slow
that it is hardly noticeable [11]. The situation is all the more alarming since there is not yet a quantifiable
technology for identifying this disorder precisely.

ASD is not strictly genetically determined, so specialists cannot predict at birth whether a child will
be autistic. Many real-life cases show that having autistic siblings does not necessarily mean that a child
will be autistic as well. Furthermore, according to Mythili [12], each child with autism shows very distinct
behavioral symptoms. For example, some children converse very well, while others barely utter words.
Some children with autism are very attached to others, while some do not know how to express emotions.
Some autistic children even behave like normal children, with only minimal behavioral differences. A very
thin line separates normal from autistic behavior, and these small behavioral differences are the main
challenge that keeps researchers' attention on this syndrome.

Electroencephalogram (EEG) is a method used to examine brain function regardless of the mental
state of the individual [13]. EEG provides robust parameters for examining brain activity in both the resting
state and the active state. EEG can also show which part of the brain is active during specific tasks and the
effect of those tasks. Five frequency bands are associated with the EEG signal, namely alpha, beta, gamma,
theta and delta. Each of these brain waves has its own frequency range and is associated with its own
physiological characteristics, which govern its cognitive properties. During EEG recording, the patient is
shown a series of stimuli while sitting down. Each stimulus affects the five frequency bands in a certain
way. In a child with ASD, all five frequency bands are disrupted and do not show results in accordance with
the given stimulus.

The alpha band is known to be present when a person is awake and in a relaxed mood. It is also
associated with timing and cognitive inhibition. The beta band is associated with alertness, active task
involvement and motor behavior. The gamma band is an early sensory response that helps with feature
binding in sensory processing. Finally, the delta band is active during deep sleep and during events related
to slow waves, such as detection of attention and wakefulness. The frequency ranges of the human EEG
signal are listed in Table 1, while Figure 1 illustrates a sample EEG signal from a normal person.


Table 1. EEG Brain Signal of Normal Human Being [14]

Level   Frequency range   Approximate EEG label
1       64-128 Hz         High Gamma
2       32-64 Hz          Gamma
3       16-32 Hz          Beta
4       8-16 Hz           Alpha
5       4-8 Hz            Theta
6       0-4 Hz            Delta
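
As a minimal sketch of how the ranges in Table 1 could be used programmatically, the following Python
function maps a frequency in Hz to its approximate EEG label. The function name `eeg_band` and the
half-open interval convention (lower bound inclusive, upper bound exclusive) are assumptions made for
illustration; Table 1 does not state how boundary frequencies are assigned.

```python
# Frequency ranges from Table 1, as (low_hz, high_hz, label).
# Assumption: each range is treated as half-open, [low, high).
EEG_BANDS = [
    (0.0, 4.0, "Delta"),
    (4.0, 8.0, "Theta"),
    (8.0, 16.0, "Alpha"),
    (16.0, 32.0, "Beta"),
    (32.0, 64.0, "Gamma"),
    (64.0, 128.0, "High Gamma"),
]


def eeg_band(frequency_hz):
    """Return the approximate EEG label for a frequency, or None if out of range."""
    for low, high, label in EEG_BANDS:
        if low <= frequency_hz < high:
            return label
    return None
```

For example, `eeg_band(10)` falls in the 8-16 Hz range and is therefore labeled as Alpha.
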

Across many studies, children with ASD have consistently been shown to produce atypical neural
activity during frequency band stimulation. Hence, the EEG signal holds promise for localizing the exact
area inside the brain where the alteration begins. This is further supported by [14], which states that
measuring how quickly the brain reacts to visual and audio stimuli helps in classifying autism and
diagnosing the disorder earlier.

Deep learning combines the feature extraction and classification processes [15]. Feature extraction
is the process of producing a compact but highly meaningful digital representation of a person's
fundamental biometric attributes. In typical machine learning, the important features need to be extracted
manually and then classified using one or more classifier algorithms. The two most popular approaches to
extracting features are filter methods and wrapper methods; the chi-squared test is an example of a filter
method, and the recursive feature elimination algorithm is an example of a wrapper method. The most
common classifiers are Fisher Linear Discriminant Analysis (FLDA), Random Forest and Support Vector
Machine. In many studies, two or more classification algorithms are used in order to obtain the best
classification accuracy [16]. Since deep learning combines feature extraction and classification, it can
reduce the effort spent on experimenting with and selecting features.
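
To illustrate what a filter method does, the sketch below scores binary features against a binary class
label using the chi-squared statistic for a 2x2 contingency table, and ranks the features independently of
any classifier. The function names and the toy data are hypothetical; real EEG features are usually
continuous and would need to be binned before such a test could be applied.

```python
def chi_squared_2x2(feature, labels):
    """Chi-squared statistic between a binary feature and a binary label."""
    a = sum(1 for f, y in zip(feature, labels) if f == 1 and y == 1)
    b = sum(1 for f, y in zip(feature, labels) if f == 1 and y == 0)
    c = sum(1 for f, y in zip(feature, labels) if f == 0 and y == 1)
    d = sum(1 for f, y in zip(feature, labels) if f == 0 and y == 0)
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0  # a constant feature or label carries no information
    return n * (a * d - b * c) ** 2 / denominator


def rank_features(features, labels):
    """Return feature names sorted from most to least class-dependent."""
    scores = {name: chi_squared_2x2(col, labels) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy data (hypothetical): four samples, two binary features.
labels = [1, 1, 0, 0]
features = {
    "informative": [1, 1, 0, 0],    # matches the label exactly
    "uninformative": [1, 0, 1, 0],  # independent of the label
}
ranking = rank_features(features, labels)
```

A wrapper method such as recursive feature elimination would instead retrain a classifier repeatedly while
removing the weakest features, which is more expensive but takes the classifier into account.
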

One of the most popular deep learning methods is the Convolutional Neural Network (CNN) [17].
As with other deep learning methods, the features to be extracted do not have to be specified in advance.
The CNN automatically extracts the most distinguishing characteristics from the data; this is possible
without a separate feature extractor because the features are learned during the training step of the CNN
algorithm. Examples of applications that use CNN can be found in [18-21].

Deep learning, particularly CNN, is deployed because of its ability to recognize unique features of
electrical brainwave patterns in the EEG signal. Many studies have shown success in applying CNN to EEG
signals. EEG is a non-linear and highly intricate signal that records important data, and this data reflects
the differences between one human being and another. Since EEG is very complex and non-linear in
nature, many linear classifiers cannot classify it accurately. Thus, CNN has become an increasingly popular
and preferred technique in signal detection and classification. The theory of CNN is mainly inspired by the
human brain. CNN is a neural network that consists of multilayer perceptrons (MLP). Each multilayer
perceptron serves its own purpose with its own order of execution, as shown in Figure 2. Every MLP must
have an input layer that receives the input data, at least one hidden layer, and finally an output layer that
predicts the output from the input [22].

Figure 1. Sample of EEG signal from a normal person

Several layers are commonly used in a CNN. The convolutional layer applies a set of learnable
filters, known as kernels, and produces output feature maps that are passed to the next layer. The activation
layer simplifies back-propagation and reduces redundancies. The pooling layer decreases the size of the
feature maps from the convolutional layer. The fully connected layer, or dense layer, is responsible for the
output predictions, and the softmax layer is an activation function that converts the outputs into class
probabilities. These CNN layers are executed in a cascading manner. Figure 3 shows an example of CNN
layers from the work by [23].
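
The roles of the convolutional, pooling and softmax layers can be sketched in plain Python as follows. This
is an illustration only: the function names, the 5x5 input and the 2x2 kernel are hypothetical, and a real
CNN learns its kernel values during training rather than using fixed ones.

```python
import math


def conv2d_valid(image, kernel):
    """Slide a kernel over a 2D input (no padding, stride 1) to get a feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + m][j + n] * kernel[m][n] for m in range(kh) for n in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size window."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(feature_map[i * size + m][j * size + n]
                for m in range(size) for n in range(size))
            for j in range(cols)
        ]
        for i in range(rows)
    ]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtracting the maximum improves numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


image = [
    [1, 2, 3, 0, 1],
    [0, 1, 2, 3, 1],
    [1, 0, 1, 2, 3],
    [2, 1, 0, 1, 2],
    [3, 2, 1, 0, 1],
]
kernel = [[1, 0], [0, 1]]                  # hypothetical 2x2 filter
feature_map = conv2d_valid(image, kernel)  # 4x4 feature map
pooled = max_pool2d(feature_map)           # 2x2 map after pooling
```

The 5x5 input shrinks to a 4x4 feature map after the convolution and to a 2x2 map after pooling, which shows
how each stage reduces the amount of data passed to the next layer.
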

Feature maps, weights, sample size, the Adam optimizer settings and the number of epochs also
need to be determined before the training process, as is common in deep learning. Often, adding more
layers improves the learning process and leads to better classification accuracy. An activation function is
needed for the hidden layers and the output layer. ReLU, dropout and flatten are relatively new functions
in CNN applications [24-26]. ReLU is the activation function used in every convolutional layer, dropout
prevents overfitting in the fully connected layer, and the flatten layer converts the data into a 1D array.
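
A minimal sketch of these three operations is given below. The function names are hypothetical, and the
dropout shown is the standard (Bernoulli, "inverted") variant, used here only to illustrate the idea; it is not
the Gaussian dropout applied in the proposed model.

```python
import random


def relu(values):
    """Rectified linear unit: negative values become zero."""
    return [max(0.0, v) for v in values]


def flatten(matrix):
    """Convert a 2D list into a 1D list, row by row."""
    return [v for row in matrix for v in row]


def dropout(values, rate, rng):
    """Zero each value with probability `rate` and rescale the survivors.

    Applied during training only; at inference time the values pass unchanged.
    """
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


rng = random.Random(0)                    # fixed seed so the sketch is repeatable
activations = relu(flatten([[-1.0, 2.0], [3.0, -4.0]]))
regularized = dropout(activations, rate=0.5, rng=rng)
```
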


[Figure 3 diagram: input layer, convolutional layers 1-5, fully-connected layers 1-2, softmax output]

Figure 3. The standard CNN Layer [23]


2. RESEARCH METHOD 

The methodology for this research is divided into three sections. The first section discusses the
existing dataset, the second discusses the pre-processing method applied and the last discusses the deep
learning design. The system ran on a 2.3 GHz Intel Core i5 2.5GHz machine with an NVIDIA GeForce
processor that has 610 MB of memory. Python 3.5.2, the Open Source Computer Vision Library (OpenCV)
and TensorFlow were used for pre-processing and for designing the CNN model.


2.1. Research flowchart 

This research designs an algorithm to detect autism using EEG signals. A CNN is used in this
project as a binary classifier. To ensure that the expected result is obtained, several major steps need to be
conducted, such as data collection, implementation, testing and troubleshooting. These steps are used to
analyze the data and the output. Figure 4 shows the steps in performing this research.


[Figure 4 flowchart: Existing Dataset, Pre-process of signal, Design of DL Model, a decision step on the
deep learning model (with a "No" branch), Training of model, Testing of model]

Figure 4. Main development involved

2.2. Existing dataset

The dataset was obtained from other researchers who have published related work. It was formally
requested by email from researchers who have conducted studies using EEG signals, and was obtained from
King Abdulaziz University, Jeddah, Saudi Arabia. The dataset consists of twenty files, 12 from normal
subjects and 8 from subjects with the disorder. It was recorded in a relaxed state in order to obtain as much
artifact-free EEG data as possible. The dataset is divided into two groups: a normal group (twelve healthy
volunteer subjects aged 9-16 years) and an autistic group (eight subjects with autism traits aged 10-16
years). The EEG signals were acquired at a sampling rate of 256 Hz using active electrodes, an active digital
EEG amplifier and a recording system based on the BCI2000 software. The data acquisition system has 16
channels altogether, which are labeled according to the international 10-20 system.


2.3. Preprocessing of EEG signal 

Once the dataset is obtained, the raw data is pre-processed. The raw data is in the form of a matrix
arranged as Channel (electrode number) x Frequency (Hz) x Time (s). Every recording has the same number
of channels and the same sampling frequency, namely 16 electrode channels and 256 Hz respectively.
However, the recording time, t, differs between recordings. For example, the recording for person
Normal#1 is arranged as 16x256x112, which means that it was recorded for 112 s, whereas the recording for
person Normal#2 is arranged as 16x256x95, which indicates that it was recorded for 95 s only.
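
The layout described above can be illustrated with a short Python sketch that splits one recording, stored as
a channels x samples x seconds nested list, into one 2D matrix per second. The function name
`split_into_seconds` is hypothetical, and the sketch is only an illustration of the data layout, not the
MATLAB procedure used in this work.

```python
def split_into_seconds(recording):
    """Split a channels x samples x seconds nested list into per-second 2D segments.

    recording[c][s][t] is the value of channel c at sample s within second t.
    The result is a list with one channels x samples matrix per second.
    """
    n_channels = len(recording)
    n_samples = len(recording[0])
    n_seconds = len(recording[0][0])
    return [
        [[recording[c][s][t] for s in range(n_samples)] for c in range(n_channels)]
        for t in range(n_seconds)
    ]


# Tiny stand-in for a 16x256x112 recording: 2 channels, 3 samples, 4 seconds.
toy_recording = [[[c * 100 + s * 10 + t for t in range(4)] for s in range(3)]
                 for c in range(2)]
toy_segments = split_into_seconds(toy_recording)  # 4 segments of size 2x3
```

Applied to a 16x256x112 recording, the same function would return 112 segments of size 16x256.
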

All recordings are pre-processed according to the time taken to record the data, using MATLAB
software. For person Normal#1, for example, the data is split manually into 112 one-second segments, each
stored as a .csv file. The files are labeled with arbitrary letters. For example, at time t = 1 s for person
Normal#1, the file is labeled 'a1.csv', where 'a' indicates all files for person Normal#1 and '1' indicates the
time t in seconds. At time t = 2 s, the file is saved as 'a2.csv'. This is done for all recordings. Table 2 shows
the labeling, which continues in the same way for the whole dataset of 20 subjects.


Table 2. Random labelling of dataset

Dataset     Label
Normal#1    a
Normal#2    b
Autism#1    A
Autism#2    B


Next, all the segments are combined into a single file named 'all.csv', which contains a total of
17,136 samples, each in 16x256 matrix form. The next pre-processing step is to label each sample as
belonging to a normal person or an autistic patient. The label is stored in the final column and is set to 1
for a normal person and 2 for an autistic patient. Finally, the data is passed to a pre-processing algorithm
that performs augmentation and removal of noise using random shuffling and white Gaussian noise.
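
The text does not give the details of this algorithm, so the following is only a sketch of one plausible form
of the augmentation part: each labeled segment is paired with a noisy copy and the result is shuffled. The
function names, the noise level `sigma`, the choice of one noisy copy per segment and the fixed seed are all
assumptions, and the noise removal step is not shown.

```python
import random

NORMAL_LABEL = 1  # label used for normal subjects in the text
AUTISM_LABEL = 2  # label used for autistic subjects in the text


def add_gaussian_noise(segment, sigma, rng):
    """Return a copy of a 2D segment with zero-mean white Gaussian noise added."""
    return [[value + rng.gauss(0.0, sigma) for value in row] for row in segment]


def augment_and_shuffle(segments, labels, sigma=0.01, seed=0):
    """Pair each segment with its label, append one noisy copy each, then shuffle."""
    rng = random.Random(seed)
    samples = list(zip(segments, labels))
    samples += [
        (add_gaussian_noise(segment, sigma, rng), label)
        for segment, label in zip(segments, labels)
    ]
    rng.shuffle(samples)
    return samples
```

Shuffling matters here because the combined file is otherwise ordered by subject, so that consecutive
samples would all share the same label.
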


2.4. Design of deep learning algorithm 

The deep learning model is designed to fit the EEG data in 2D matrix form. The model, implemented
in Python, is shown in Figure 5. The proposed model uses a CNN architecture with a total of 6 layers: three
convolutional layers, one flatten layer and two dense (fully connected) layers. Batch normalization is
applied after each convolutional layer. The model also uses Gaussian dropout weights, because the
protocols of transfer learning state that when a CNN is trained from scratch, it needs to be started with
random Gaussian distributions. The Adam optimizer is used with a learning rate of 0.0001 for 200 epochs.
The ReLU activation function is used in every convolutional layer, while the final dense layer uses the
softmax activation. Table 3 shows how the dataset is distributed for model training, validation and
inference.


Table 3. Distribution of dataset

Model             Distribution (%)
Training Set      80
Validation Set    10
Test Set          10
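
As a worked example of this distribution, the sketch below computes the number of samples in each set. It
assumes that the split is applied to the 17,136 samples mentioned earlier (that is, before any augmentation)
and that fractional counts are rounded down, with the remainder assigned to the test set; neither detail is
stated in the text, and the function name is hypothetical.

```python
def split_counts(total, train_fraction=0.8, validation_fraction=0.1):
    """Return (train, validation, test) sample counts for a given total."""
    n_train = int(total * train_fraction)            # rounded down
    n_validation = int(total * validation_fraction)  # rounded down
    n_test = total - n_train - n_validation          # remainder goes to the test set
    return n_train, n_validation, n_test


n_train, n_validation, n_test = split_counts(17136)
```

Under these assumptions the split gives 13,708 training, 1,713 validation and 1,715 test samples.
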

Figure 5. Design of Deep Learning Model

Generally, the derived predictive model is assessed with a number of evaluation measures. For the
classification of predicted classes into ASD and Non-ASD, Table 4 shows the possible outcomes for a test
case prediction.


Table 4. Confusion matrix for diagnosis on ASD

                  Predicted Class
Actual Class      ASD                    Non-ASD
ASD               True Positive (TP)     False Negative (FN)
Non-ASD           False Positive (FP)    True Negative (TN)


The performance of the classification result is analyzed using the confusion matrix of the test file.
The confusion matrix is a particular table layout that allows visualization of the performance of an
algorithm. Classification accuracy (1) is one of the common evaluation measures. It identifies the
proportion of test cases that have been correctly classified.

$$\mathrm{Accuracy}(\%) = \frac{|TP + TN|}{|TP + TN + FP + FN|} \qquad (1)$$
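
Equation (1) can be computed directly from the four entries of Table 4, as in the following sketch. The
function name is hypothetical, the counts in the example are illustrative and are not results from this study,
and the ratio is multiplied by 100 on the assumption that the accuracy is reported as a percentage.

```python
def classification_accuracy(tp, tn, fp, fn):
    """Accuracy from confusion-matrix counts, expressed as a percentage."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return 100.0 * (tp + tn) / total


# Illustrative counts only: 80 of 100 test cases are classified correctly.
example_accuracy = classification_accuracy(tp=40, tn=40, fp=10, fn=10)
```
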


3. RESULT & ANALYSIS 
3.1. Training result 

The training process is repeated five times. Each time the training process ends, the training result
is overwritten and automatically stored in a web file, which is updated whenever the training code is
executed. Thus, when further training is done, the result from the previous training is used as the basis for
that training run. The first training run achieved an accuracy of 70%, the second achieved 79% and the next
three achieved a consistent accuracy of about 80%. When the training accuracy no longer increases after
several training runs, the stopping criterion has been reached. The stopping criterion shows that either more
data needs to be added or the CNN model needs to be improved in order to achieve higher accuracy. Figures
6(a) and 6(b) show the training progress of the CNN using the obtained samples, and the respective
fluctuations in the accuracy and loss metrics across the training epochs.
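
The stopping criterion described above can be expressed as a simple plateau check, sketched below. The
function name, the window of three runs (`patience`) and the one-percentage-point threshold (`min_gain`)
are assumptions chosen for illustration, and the accuracy values are the approximate figures quoted in the
text.

```python
def has_plateaued(accuracies, patience=3, min_gain=1.0):
    """Return True when the last `patience` runs vary by less than `min_gain` points."""
    if len(accuracies) < patience:
        return False
    recent = accuracies[-patience:]
    return max(recent) - min(recent) < min_gain


run_accuracies = [70.0, 79.0, 80.0, 80.0, 80.0]  # approximate values from the text
stop = has_plateaued(run_accuracies)
```

With these values the check succeeds only once the third consecutive run at about 80% has been recorded.
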

As shown in Figure 6, the training accuracy obtained is only 80%. Note that this accuracy is based
on only twenty recordings and a six-layer CNN model. Although convolution over data in 2D matrix form
is possible, the result is not as good as that obtained when training on 2D images. This is because the 2D
convolution process cannot fully capture the spatial relationships between electrodes. In other words, EEG
data, and signals in general, have few spatial regularities for the 2D CNN to learn from.

Figure 6. Training progress, (a) Accuracy, (b) Loss


3.2. Test result 

An inference algorithm is developed to test whether the trained CNN model can correctly classify a
given EEG signal from a random sample. The inference algorithm loads the CNN model stored in an .h5
file. Then an EEG sample from either a normal person or an autistic patient is loaded, and the predicted
output is printed. The deep learning model designed in this project did not achieve high classification
accuracy, so the predicted output is rather inconsistent: the prediction is sometimes correct and sometimes
incorrect. This inconsistency of the printed result is clearly due to the accuracy obtained during training.


4. CONCLUSIONS 

In conclusion, the objectives of this research were achieved, with some improvements that could be
made in future studies. Since few EEG datasets on autism patients are publicly available, the dataset was
formally requested from several universities inside and outside the country. The dataset tested in this
research was obtained from King Abdulaziz University, Jeddah, Saudi Arabia, and covers a total of only 20
persons. The deep learning model using CNN was developed with a total of six layers. As mentioned, each
layer in the CNN plays a major role in making the training and learning process successful. The model was
trained five times, with a consistent accuracy rate. The stopping criterion shows that for this specific CNN
model, the maximum achievable accuracy is only around eighty percent. The evaluation of the deep
learning model shows inconsistency in the printed result, which is a direct consequence of the low accuracy.
Although the achieved accuracy is only about 80%, it shows that there is potential for developing a better
algorithm by using a more complex deep learning model and more data.

The major contribution of this research to society is an alternative method for detecting the
presence of autism in a child. The current method of diagnosing ASD is very time consuming and therefore
not reliable, since studies have shown that children with ASD suffer from many side effects of ASD, such as
visual impairment, from a young age. Furthermore, a fully developed, high-accuracy system could be
employed by the Ministry of Health as a new EEG-based method for diagnosing ASD in high-risk children.
In economic terms, it can greatly reduce the time taken to detect ASD, so that early treatment can be
planned by pediatricians to provide better health support to the autistic patient.